interactive agent
EvolveGraph: Multi-Agent Trajectory Prediction with Dynamic Relational Reasoning
Multi-agent interacting systems are prevalent in the world, from purely physical systems to complicated social dynamic systems. In many applications, effective understanding of the situation and accurate trajectory prediction of interactive agents play a significant role in downstream tasks, such as decision making and planning. In this paper, we propose a generic trajectory forecasting framework (named EvolveGraph) with explicit relational structure recognition and prediction via latent interaction graphs among multiple heterogeneous, interactive agents. Considering the uncertainty of future behaviors, the model is designed to provide multi-modal prediction hypotheses. Since the underlying interactions may evolve even with abrupt changes, and different modalities of evolution may lead to different outcomes, we address the necessity of dynamic relational reasoning and adaptively evolving the interaction graphs. We also introduce a double-stage training pipeline which not only improves training efficiency and accelerates convergence, but also enhances model performance. The proposed framework is evaluated on both synthetic physics simulations and multiple real-world benchmark datasets in various areas. The experimental results illustrate that our approach achieves state-of-the-art performance in terms of prediction accuracy.
VC-Agent: An Interactive Agent for Customized Video Dataset Collection
Zhang, Yidan, Xu, Mutian, Hao, Yiming, Zhou, Kun, Chang, Jiahao, Liu, Xiaoqiang, Wan, Pengfei, Fu, Hongbo, Han, Xiaoguang
Facing scaling laws, video data from the internet becomes increasingly important. However, collecting extensive videos that meet specific needs is extremely labor-intensive and time-consuming. In this work, we study the way to expedite this collection process and propose VC-Agent, the first interactive agent that is able to understand users' queries and feedback, and accordingly retrieve/scale up relevant video clips with minimal user input. Specifically, considering the user interface, our agent defines various user-friendly ways for the user to specify requirements based on textual descriptions and confirmations. As for agent functions, we leverage existing multi-modal large language models to connect the user's requirements with the video content. More importantly, we propose two novel filtering policies that can be updated when user interaction is continually performed. Finally, we provide a new benchmark for personalized video dataset collection, and carefully conduct the user study to verify our agent's usage in various real scenarios. Extensive experiments demonstrate the effectiveness and efficiency of our agent for customized video dataset collection. Project page: https://allenyidan.github.io/vcagent_page/.
- Asia > China > Guangdong Province > Shenzhen (0.42)
- Asia > China > Hong Kong (0.41)
- Asia > China > Beijing > Beijing (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Research Report (1.00)
- Overview (1.00)
- Questionnaire & Opinion Survey (0.87)
- Media > Film (0.92)
- Leisure & Entertainment (0.92)
Interactive Agents to Overcome Ambiguity in Software Engineering
Vijayvargiya, Sanidhya, Zhou, Xuhui, Yerukola, Akhila, Sap, Maarten, Neubig, Graham
AI agents are increasingly being deployed to automate tasks, often based on ambiguous and underspecified user instructions. Making unwarranted assumptions and failing to ask clarifying questions can lead to suboptimal outcomes, safety risks due to tool misuse, and wasted computational resources. In this work, we study the ability of LLM agents to handle ambiguous instructions in interactive code generation settings by evaluating proprietary and open-weight models on their performance across three key steps: (a) leveraging interactivity to improve performance in ambiguous scenarios, (b) detecting ambiguity, and (c) asking targeted questions. Our findings reveal that models struggle to distinguish between well-specified and underspecified instructions. However, when models interact for underspecified inputs, they effectively obtain vital information from the user, leading to significant improvements in performance and underscoring the value of effective interaction. Our study highlights critical gaps in how current state-of-the-art models handle ambiguity in complex software engineering tasks and structures the evaluation into distinct steps to enable targeted improvements.
- Oceania > Australia (0.04)
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
- Research Report > Experimental Study (0.68)
- Research Report > New Finding (0.66)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.98)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.73)
GRETA: Modular Platform to Create Adaptive Socially Interactive Agents
Grimaldi, Michele, Woo, Jieyeon, Boucaud, Fabien, Galland, Lucie, Younsi, Nezih, Yang, Liu, Fares, Mireille, Graux, Sean, Gauthier, Philippe, Pelachaud, Catherine
The interaction between humans is very complex to describe since it is composed of different elements from different modalities such as speech, gaze, and gestures influenced by social attitudes and emotions. Furthermore, the interaction can be affected by some features which refer to the interlocutor's state. Actual Socially Interactive Agents SIAs aim to adapt themselves to the state of the interaction partner. In this paper, we discuss this adaptation by describing the architecture of the GRETA platform which considers external features while interacting with humans and/or another ECA and process the dialogue incrementally. We illustrate the new architecture of GRETA which deals with the external features, the adaptation, and the incremental approach for the dialogue processing.
Human-like Nonverbal Behavior with MetaHumans in Real-World Interaction Studies: An Architecture Using Generative Methods and Motion Capture
Chojnowski, Oliver, Eberhard, Alexander, Schiffmann, Michael, Müller, Ana, Richert, Anja
Socially interactive agents are gaining prominence in domains like healthcare, education, and service contexts, particularly virtual agents due to their inherent scalability. To facilitate authentic interactions, these systems require verbal and nonverbal communication through e.g., facial expressions and gestures. While natural language processing technologies have rapidly advanced, incorporating human-like nonverbal behavior into real-world interaction contexts is crucial for enhancing the success of communication, yet this area remains underexplored. One barrier is creating autonomous systems with sophisticated conversational abilities that integrate human-like nonverbal behavior. This paper presents a distributed architecture using Epic Games MetaHuman, combined with advanced conversational AI and camera-based user management, that supports methods like motion capture, handcrafted animation, and generative approaches for nonverbal behavior. We share insights into a system architecture designed to investigate nonverbal behavior in socially interactive agents, deployed in a three-week field study in the Deutsches Museum Bonn, showcasing its potential in realistic nonverbal behavior research.
- Europe > Germany > North Rhine-Westphalia > Cologne Region > Cologne (0.05)
- North America > United States > New York > New York County > New York City (0.05)
- North America > United States > Texas > Brazos County > College Station (0.04)
- (3 more...)
Estuary: A Framework For Building Multimodal Low-Latency Real-Time Socially Interactive Agents
Lin, Spencer, Rizk, Basem, Jun, Miru, Artze, Andy, Sullivan, Caitlin, Mozgai, Sharon, Fisher, Scott
The rise in capability and ubiquity of generative artificial intelligence (AI) technologies has enabled its application to the field of Socially Interactive Agents (SIAs). Despite rising interest in modern AI-powered components used for real-time SIA research, substantial friction remains due to the absence of a standardized and universal SIA framework. To target this absence, we developed Estuary: a multimodal (text, audio, and soon video) framework which facilitates the development of low-latency, real-time SIAs. Estuary seeks to reduce repeat work between studies and to provide a flexible platform that can be run entirely off-cloud to maximize configurability, controllability, reproducibility of studies, and speed of agent response times. We are able to do this by constructing a robust multimodal framework which incorporates current and future components seamlessly into a modular and interoperable architecture.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.88)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.73)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.49)
EvolveGraph: Multi-Agent Trajectory Prediction with Dynamic Relational Reasoning
Multi-agent interacting systems are prevalent in the world, from purely physical systems to complicated social dynamic systems. In many applications, effective understanding of the situation and accurate trajectory prediction of interactive agents play a significant role in downstream tasks, such as decision making and planning. In this paper, we propose a generic trajectory forecasting framework (named EvolveGraph) with explicit relational structure recognition and prediction via latent interaction graphs among multiple heterogeneous, interactive agents. Considering the uncertainty of future behaviors, the model is designed to provide multi-modal prediction hypotheses. Since the underlying interactions may evolve even with abrupt changes, and different modalities of evolution may lead to different outcomes, we address the necessity of dynamic relational reasoning and adaptively evolving the interaction graphs.
Combining Cognitive and Generative AI for Self-explanation in Interactive AI Agents
Sushri, Shalini, Dass, Rahul, Basappa, Rhea, Lu, Hong, Goel, Ashok
The Virtual Experimental Research Assistant (VERA) is an inquiry-based learning environment that empowers a learner to build conceptual models of complex ecological systems and experiment with agent-based simulations of the models. This study investigates the convergence of cognitive AI and generative AI for self-explanation in interactive AI agents such as VERA. From a cognitive AI viewpoint, we endow VERA with a functional model of its own design, knowledge, and reasoning represented in the Task--Method--Knowledge (TMK) language. From the perspective of generative AI, we use ChatGPT, LangChain, and Chain-of-Thought to answer user questions based on the VERA TMK model. Thus, we combine cognitive and generative AI to generate explanations about how VERA works and produces its answers. The preliminary evaluation of the generation of explanations in VERA on a bank of 66 questions derived from earlier work appears promising.
- North America > United States (0.14)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (4 more...)
- Research Report > New Finding (0.48)
- Research Report > Experimental Study (0.34)
- Education > Educational Setting > Online (0.68)
- Education > Educational Technology > Educational Software > Computer Based Training (0.46)
Socially Interactive Agents for Robotic Neurorehabilitation Training: Conceptualization and Proof-of-concept Study
Arora, Rhythm, Prajod, Pooja, Nicora, Matteo Lavit, Panzeri, Daniele, Tauro, Giovanni, Vertechy, Rocco, Malosio, Matteo, André, Elisabeth, Gebhard, Patrick
Individuals with diverse motor abilities often benefit from intensive and specialized rehabilitation therapies aimed at enhancing their functional recovery. Nevertheless, the challenge lies in the restricted availability of neurorehabilitation professionals, hindering the effective delivery of the necessary level of care. Robotic devices hold great potential in reducing the dependence on medical personnel during therapy but, at the same time, they generally lack the crucial human interaction and motivation that traditional in-person sessions provide. To bridge this gap, we introduce an AI-based system aimed at delivering personalized, out-of-hospital assistance during neurorehabilitation training. This system includes a rehabilitation training device, affective signal classification models, training exercises, and a socially interactive agent as the user interface. With the assistance of a professional, the envisioned system is designed to be tailored to accommodate the unique rehabilitation requirements of an individual patient. Conceptually, after a preliminary setup and instruction phase, the patient is equipped to continue their rehabilitation regimen autonomously in the comfort of their home, facilitated by a socially interactive agent functioning as a virtual coaching assistant. Our approach involves the integration of an interactive socially-aware virtual agent into a neurorehabilitation robotic framework, with the primary objective of recreating the social aspects inherent to in-person rehabilitation sessions. We also conducted a feasibility study to test the framework with healthy patients. The results of our preliminary investigation indicate that participants demonstrated a propensity to adapt to the system. Notably, the presence of the interactive agent during the proposed exercises did not act as a source of distraction; instead, it positively impacted users' engagement.
- Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Germany > Saarland > Saarbrücken (0.04)
- (4 more...)
Social Robot Mediator for Multiparty Interaction
Adikari, Manith, Cangelosi, Angelo, Gomez, Randy
A social robot acting as a 'mediator' can enhance interactions between humans, for example, in fields such as education and healthcare. A particularly promising area of research is the use of a social robot mediator in a multiparty setting, which tends to be the most applicable in real-world scenarios. However, research in social robot mediation for multiparty interactions is still emerging and faces numerous challenges. This paper provides an overview of social robotics and mediation research by highlighting relevant literature and some of the ongoing problems. The importance of incorporating relevant psychological principles for developing social robot mediators is also presented. Additionally, the potential of implementing a Theory of Mind in a social robot mediator is explored, given how such a framework could greatly improve mediation by reading the individual and group mental states to interact effectively.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- North America > United States > New York > New York County > New York City (0.05)
- (12 more...)
- Overview (1.00)
- Research Report (0.82)
- Information Technology > Artificial Intelligence > Robots > Robots in the Home (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)